Comparison of Classification Methods Based on the Type of Attributes and Sample Size

Authors

  • Reza Entezari-Maleki
  • Arash Rezaei
  • Behrouz Minaei-Bidgoli
Abstract

This paper compares the efficacy of seven data classification methods, Decision Tree (DT), k-Nearest Neighbor (k-NN), Logistic Regression (LogR), Naïve Bayes (NB), C4.5, Support Vector Machine (SVM) and Linear Classifier (LC), with respect to the Area Under Curve (AUC) metric. It investigates the effects of the dataset size, the type of the independent attributes, and the number of discrete and continuous attributes. The results indicate that on datasets with few records the AUC deviates, so the classifiers may not be compared correctly. As the number of records and the number of attributes per record increase, the results become more stable. The four classifiers DT, k-NN, SVM and C4.5 obtain higher AUC than the three classifiers LogR, NB and LC. Among these four, C4.5 provides the higher AUC in most cases. Among LogR, NB and LC, NB provides the best AUC, and LogR and NB give approximately the same results.
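
As a rough illustration of why AUC estimates are unstable on small datasets, the sketch below computes AUC as a pairwise ranking statistic and measures its spread over repeated synthetic test sets. It is not the authors' experimental setup: the Gaussian score model, the balanced labels, and the names `auc` and `auc_spread` are assumptions made only for this example.

```python
import random
import statistics


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_spread(n_records, n_trials=100, seed=0):
    """Standard deviation of AUC across synthetic test sets of one size.

    Assumption: a hypothetical scorer whose outputs are N(1, 1) for
    positives and N(0, 1) for negatives, with balanced classes.
    """
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_trials):
        labels = [i % 2 for i in range(n_records)]
        scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
        aucs.append(auc(scores, labels))
    return statistics.stdev(aucs)


if __name__ == "__main__":
    # Spread of the AUC estimate for increasing numbers of records.
    for n in (20, 100, 400):
        print(n, round(auc_spread(n), 3))
```

Under these assumptions the spread shrinks as the number of records grows, so a gap between two classifiers' AUC values on a very small dataset may reflect sampling noise rather than a real difference.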

Similar resources

Forest Stand Types Classification Using Tree-Based Algorithms and SPOT-HRG Data

Forest type mapping is one of the most necessary elements in forest management and silviculture treatments. Traditional methods such as field surveys are often time-consuming and cost-intensive. Improvements in remote sensing data sources and classification-estimation methods are creating new opportunities for obtaining more accurate maps of forest biophysical attributes. This research co...

Support Vector Machine Based Facies Classification Using Seismic Attributes in an Oil Field of Iran

Seismic facies analysis (SFA) aims to classify similar seismic traces based on amplitude, phase, frequency, and other seismic attributes. SFA has proven useful in interpreting seismic data, allowing significant information on subsurface geological structures to be extracted. While facies analysis has been widely investigated through unsupervised-classification-based studies, there are few cases...

A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset

Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that ECG analysis should use algorithms that can effectively handle the nonstationarity and uncertainty of the signals. Ex...

Weighted Feature Line Embedding for Feature Extraction of Hyperspectral Images

One of the most important preprocessing steps before the classification of hyperspectral images is supervised feature extraction. Because obtaining training samples is hard and time-consuming, the number of available training samples is limited. In this paper, we propose a supervised feature extraction method that is efficient in small-sample-size situations. The proposed method, which is called weight...

Sample Size Determination for Binary Data in Two Independent Groups in Medical Studies

Background and Objectives: Nowadays, many clinical studies investigate the effect of a new method or medicine by comparing two independent groups. Unfortunately, in some of these studies the sample size is determined by practical considerations rather than by scientific formulas, which leads to unreliable results. Therefor...

Rock typing and reservoir zonation based on the NMR logging and geological attributes in the mixed carbonate-siliciclastic Asmari Reservoir

Rock typing is known as the best approach to characterizing heterogeneous reservoirs. Rock typing methods are confined to various aspects of the rocks, such as multi-scale and multi-modal pore types and sizes, rock texture, diagenetic modifications and the integration of static/dynamic data. This study integrates the static and dynamic behavior of rocks with their sedimentary features. Po...

Journal title:
  • JCIT

Volume 4  Issue 

Pages  -

Publication date 2009